A MULTILEVEL PARALLEL AND SCALABLE SINGLE-HOST GPU CLUSTER FRAMEWORK FOR LARGE-SCALE GEOSPATIAL DATA PROCESSING
Authors
Grant J. Scott and Kirk Backus
University of Missouri, Center for Geospatial Intelligence, Columbia, Missouri, USA
Abstract
Geospatial data exists in a variety of formats, including rasters, vector data, and large-scale geospatial databases. An ever-growing number of sensors collect this data, driving explosive growth in the scale of high-resolution remote sensing data collections. A particularly challenging domain of geospatial data processing is mining information from high-resolution remote sensing imagery. These raster collections pose a significant data challenge, because a single remote sensing image is composed of hundreds of millions of pixels. We have developed a robust application framework that exploits graphics processing unit (GPU) clusters to perform high-throughput geospatial data processing. The framework processes tiles of large geospatial rasters concurrently, using GPU co-processors driven by CPU threads to extract refined geospatial information. It can produce output rasters or perform image information mining and write the results into a geospatial database.
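The tile-level concurrency described in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the function names (`tile_windows`, `process_tile`, `process_raster`), the tile size, and the per-tile mean are hypothetical, and the GPU kernel is replaced by a plain CPU computation so the example runs anywhere. It shows only the general pattern of splitting a raster into tiles and letting a pool of CPU threads drive the per-tile work.

```python
from concurrent.futures import ThreadPoolExecutor


def tile_windows(height, width, tile):
    """Yield (row, col, rows, cols) windows that cover a height x width raster."""
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            # Edge tiles are clipped so they never run past the raster bounds.
            yield r, c, min(tile, height - r), min(tile, width - c)


def process_tile(raster, window):
    """Hypothetical stand-in for a GPU kernel launch: a per-tile mean on the CPU."""
    r, c, rows, cols = window
    total = sum(sum(raster[i][c:c + cols]) for i in range(r, r + rows))
    return window, total / (rows * cols)


def process_raster(raster, tile=2, workers=4):
    """Process every tile concurrently with a pool of CPU worker threads.

    In a GPU setting, each worker thread would be assumed to drive one device.
    """
    height, width = len(raster), len(raster[0])
    windows = list(tile_windows(height, width, tile))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda w: process_tile(raster, w), windows))


if __name__ == "__main__":
    # A tiny 4 x 4 raster with values 0..15, split into four 2 x 2 tiles.
    raster = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    for window, mean in sorted(process_raster(raster).items()):
        print(window, mean)
```

Each tile is independent of the others, which is what makes this kind of workload straightforward to spread across worker threads. The per-tile results could then be assembled into an output raster or written to a database.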
Similar resources
Parallelization of Rich Models for Steganalysis of Digital Images using a CUDA-based Approach
There are several different methods for building an efficient strategy for steganalysis of digital images. A very powerful method in this area is the rich model, which consists of a large number of diverse sub-models in both the spatial and transform domains. However, extracting the various types of features from an image is very time consuming in some steps, especially for training pha...
Efficient and Scalable Parallel Zonal Statistics on Large-Scale Species Occurrence Data on GPUs
Analyzing how species are distributed on the Earth has been one of the fundamental questions at the intersection of environmental sciences, geosciences and biological sciences. With world-wide data contributions, more than 375 million species occurrence records for nearly 1.5 million species have been deposited to the Global Biodiversity Information Facility (GBIF) data portal. The sheer amoun...
An Agent-based Architecture for Distributed Imagery & Geospatial Computing
Agent-based approaches have not yet been widely applied to highly complex, data-intensive, large-scale information processing systems such as those found in the domain of imagery & geospatial computing. Such systems combine diverse and distributed types of imagery and geospatial data, and require collaboration from multiple experts and processing components. This paper gives a description of the d...
Large-Scale Geospatial Processing on Multi-Core and Many-Core Processors: Evaluations on CPUs, GPUs and MICs
Geospatial processing operations, such as queries based on point-to-polyline shortest distance and point-in-polygon tests, are fundamental to many scientific and engineering applications, including post-processing large-scale environmental and climate model outputs and analyzing traffic and travel patterns from massive GPS collections in transportation engineering and urban studies. Commodity parallel hard...
Scalable Data Clustering using GPU Clusters
The computational demands of multivariate clustering grow rapidly, and therefore processing large data sets, like those found in flow cytometry data, is very time consuming on a single CPU. Fortunately these techniques lend themselves naturally to large scale parallel processing. To address the computational demands, graphics processing units, specifically NVIDIA’s CUDA framework and Tesla arch...